Optimal Hidden Markov Models for All Sequences of Known Structure

Authors

  • Julian Gough
  • Cyrus Chothia
  • Kevin Karplus
  • Christian Barrett
  • Richard Hughey

Abstract

Hidden Markov Models (HMMs) are probably the most powerful tool for detecting protein sequence homology [4]. Making full use of their capabilities and biological usefulness requires two things: correct interpretation of their scores, and sufficient coverage of the sequence variation that exists within different protein families. Using information available from the SCOP database, we investigated optimal measures for these two aspects of HMMs [5]. The SCOP database is a hierarchical classification of the domains found in all proteins of known structure. Evolutionary relationships between proteins that cannot be detected from sequence can usually be inferred from structure. These relationships are recorded in SCOP, where each superfamily contains a set of proteins whose evolutionary relationships are derived from both sequence and structure. Searches using a set of HMMs, one built for each member of a superfamily of homologous sequences, worked more effectively than searches using a single HMM built from an alignment of those sequences. Using the SAM-Target99 procedure (see below), HMMs were built for all PDB95 sequences, that is, the 4591 sequences of proteins of known structure that have sequence identities of 95% or less. These sequences, in the context of their SCOP classification, form the basis of the results described below.
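
The idea of searching with one model per superfamily member, rather than one model per superfamily, can be sketched in a few lines of Python. This is a minimal illustration and not the procedure used in the paper: the abstract does not say how scores from the individual models are combined, so taking the lowest E-value per superfamily is an assumption, as are the function names, the 0.01 threshold, and all example identifiers and numbers.

```python
from typing import Dict, Optional, Tuple

# Hypothetical input: the E-value each model reports for one query sequence.
# A model is identified by (superfamily name, seed sequence id).
ModelKey = Tuple[str, str]


def best_hit_per_superfamily(
    evalues: Dict[ModelKey, float],
) -> Dict[str, Tuple[str, float]]:
    """For each superfamily, keep the seed model with the lowest E-value."""
    best: Dict[str, Tuple[str, float]] = {}
    for (superfamily, seed), evalue in evalues.items():
        if superfamily not in best or evalue < best[superfamily][1]:
            best[superfamily] = (seed, evalue)
    return best


def assign_superfamily(
    evalues: Dict[ModelKey, float],
    threshold: float = 0.01,  # assumed cutoff, not taken from the source
) -> Optional[str]:
    """Return the superfamily of the best-scoring model, or None if no
    model scores at or below the threshold."""
    best = best_hit_per_superfamily(evalues)
    if not best:
        return None
    superfamily, (_seed, evalue) = min(best.items(), key=lambda kv: kv[1][1])
    return superfamily if evalue <= threshold else None


# Made-up example: two models for one superfamily, one for another.
example = {
    ("globin-like", "seed_a"): 0.5,
    ("globin-like", "seed_b"): 1e-6,
    ("ferredoxin-like", "seed_c"): 0.2,
}
print(assign_superfamily(example))
```

In this sketch, a query that only one member's model recognizes is still assigned to the superfamily, which is the kind of added coverage a set of models can give over a single model built from the whole alignment.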




Publication date: 2000